18 research outputs found

    A Deep Neural Network -- Mechanistic Hybrid Model to Predict Pharmacokinetics in Rat

    An important aspect in the development of small molecules as drugs or agrochemicals is their systemic availability after intravenous and oral administration. Predicting the systemic availability from the chemical structure of a potential candidate is highly desirable, as it allows drug or agrochemical development to focus on compounds with a favorable kinetic profile. However, such predictions are challenging because the availability results from a complex interplay between molecular properties, biology and physiology, and training data are scarce. In this work we improve the hybrid model developed earlier [1]. We reduce the median fold change error for the total oral exposure from 2.85 to 2.35 and for intravenous administration from 1.95 to 1.62. This is achieved by training on a larger data set and by improving both the neural network architecture and the parametrization of the mechanistic model. Further, we extend our approach to predict additional endpoints and to handle different covariates, such as sex and dosage form. In contrast to a pure machine learning model, our model is able to predict new endpoints on which it has not been trained. We demonstrate this feature by predicting the exposure over the first 24 h, although the model has only been trained on the total exposure. Comment: Version accepted by Journal of Computer-Aided Molecular Design
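The abstract above quotes a median fold change error without defining it. A minimal sketch, assuming the common convention that each compound's fold error is the larger of predicted/observed and observed/predicted, with the median taken over compounds. The function name and the sample exposures are hypothetical and are not taken from the paper:

```python
from statistics import median


def median_fold_change_error(predicted, observed):
    """Median of per-compound fold errors (assumed definition, see lead-in).

    A fold error of 1.0 means a perfect prediction; 2.0 means the prediction
    is off by a factor of two in either direction.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if any(p <= 0 or o <= 0 for p, o in zip(predicted, observed)):
        raise ValueError("exposures must be strictly positive")
    # Symmetric ratio: over- and under-prediction are penalised equally.
    fold_errors = [max(p / o, o / p) for p, o in zip(predicted, observed)]
    return median(fold_errors)


# Hypothetical exposures in arbitrary units.
predicted = [2.0, 1.0, 8.0]
observed = [1.0, 2.0, 2.0]
print(median_fold_change_error(predicted, observed))  # 2.0
```

Under this convention, the reported drop from 2.85 to 2.35 would mean that the typical prediction moves from being within a factor of 2.85 of the measurement to within a factor of 2.35.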

    Predicting drug metabolism: experiment and/or computation?

    Drug metabolism can produce metabolites with physicochemical and pharmacological properties that differ substantially from those of the parent drug, and consequently has important implications for both drug safety and efficacy. To reduce the risk of costly clinical-stage attrition due to the metabolic characteristics of drug candidates, there is a need for efficient and reliable ways to predict drug metabolism in vitro, in silico and in vivo. In this Perspective, we provide an overview of the state of the art of experimental and computational approaches for investigating drug metabolism. We highlight the scope and limitations of these methods, and indicate strategies to harvest the synergies that result from combining measurement and prediction of drug metabolism. This is the accepted manuscript of a paper published in Nature Reviews Drug Discovery (Kirchmair J, Göller AH, Lang D, Kunze J, Testa B, Wilson ID, Glen RC, Schneider G, Nature Reviews Drug Discovery, 2015, 14, 387–404, doi:10.1038/nrd4581). The final version is available at http://dx.doi.org/10.1038/nrd458

    Dataset overlap density analysis

    Automated Quantum Chemistry for Estimating Nucleophilicity and Electrophilicity with Applications to Retrosynthesis and Covalent Inhibitors

    Reactivity scales such as nucleophilicity and electrophilicity are valuable tools for determining chemical reactivity and selectivity. However, prior attempts to predict or calculate nucleophilicity and electrophilicity are either not capable of generalizing well to unseen molecular structures or require substantial computing resources. We present a fully automated quantum chemistry (QM)-based workflow that automatically identifies nucleophilic and electrophilic sites and computes methyl cation affinities and methyl anion affinities to quantify nucleophilicity and electrophilicity, respectively. The calculations are based on r2SCAN-3c SMD(DMSO) single-point calculations on GFN1-xTB ALPB(DMSO) geometries that, in turn, derive from a GFNFF-xTB ALPB(DMSO) conformational search. The workflow is validated against both experimental and higher-level QM-derived data, resulting in very strong correlations while having a median wall time of less than two minutes per molecule. Additionally, we demonstrate the workflow on two different applications: first, as a general tool for filtering retrosynthetic routes based on chemical selectivity predictions, and second, as a tool for determining the relative reactivity of covalent inhibitors. The code is freely available on GitHub under the MIT open source license and as a web application at www.esnuel.org.

    RegioML: Predicting the regioselectivity of electrophilic aromatic substitution reactions using machine learning

    We present RegioML, an atom-based machine learning model for predicting the regioselectivities of electrophilic aromatic substitution reactions. The model relies on CM5 atomic charges computed using semiempirical tight binding (GFN1-xTB) combined with the ensemble decision tree variant light gradient boosting machine (LightGBM). The model is trained and tested on 21,201 bromination reactions with 101K reaction centers, which are split into training, test, and out-of-sample datasets with 58K, 15K, and 27K reaction centers, respectively. The accuracy is 93% for the test set and 90% for the out-of-sample set, while the precision (the percentage of positive predictions that are correct) is 88% and 80%, respectively. The test-set performance is very similar to the graph-based WLN method developed by Struble et al. (React. Chem. Eng. 2020, 5, 896), though the comparison is complicated by the possibility that some of the test and out-of-sample molecules are used to train WLN. RegioML outperforms our physics-based RegioSQM20 method (J. Cheminform. 2021, 13:10), where the precision is only 75%. Even for the out-of-sample dataset, RegioML slightly outperforms RegioSQM20. The good performance of RegioML and WLN is in large part due to the large datasets available for this type of reaction. However, for reactions where there is little experimental data, physics-based approaches like RegioSQM20 can be used to generate synthetic data for model training. We demonstrate this by showing that the performance of RegioSQM20 can be reproduced by an ML model trained on RegioSQM20-generated data.
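The abstract above reports both accuracy and precision, and defines precision as the percentage of positive predictions that are correct. A minimal sketch of how the two differ for a per-atom binary classifier, where 1 marks a reactive site. The function name and the labels are made-up illustrations, not RegioML data:

```python
def accuracy_and_precision(y_true, y_pred):
    """Return (accuracy, precision) for binary labels given as 0/1."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    correct = sum(1 for t, p in pairs if t == p)        # TP + TN
    accuracy = correct / len(pairs)
    # Precision is undefined when nothing is predicted positive.
    precision = tp / (tp + fp) if (tp + fp) > 0 else float("nan")
    return accuracy, precision


# Hypothetical molecule with ten candidate atoms, two of them reactive.
y_true = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
acc, prec = accuracy_and_precision(y_true, y_pred)
print(acc, prec)  # 0.8 0.5
```

Because most atoms are not reaction centers, accuracy is dominated by true negatives, which is one reason a precision figure is informative alongside it.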

    Machine learning models for hydrogen bond donor and acceptor strengths using large and diverse training data generated by first-principles interaction free energies

    We present machine learning (ML) models for hydrogen bond acceptor (HBA) and hydrogen bond donor (HBD) strengths. Quantum chemical (QC) free energies in solution for 1:1 hydrogen-bonded complex formation to the reference molecules 4-fluorophenol and acetone serve as our target values. Our acceptor and donor databases are the largest on record with 4426 and 1036 data points, respectively. After scanning over radial atomic descriptors and ML methods, our final trained HBA and HBD ML models achieve RMSEs of 3.8 kJ mol⁻¹ (acceptors) and 2.3 kJ mol⁻¹ (donors) on experimental test sets. This performance is comparable with previous models that are trained on experimental hydrogen bonding free energies, indicating that molecular QC data can serve as a substitute for experiment. This could ultimately allow QC to fully replace wet-lab chemistry for HBA/HBD strength determination. As a possible chemical application of our ML models, we highlight our predicted HBA and HBD strengths as possible descriptors in two case studies on trends in intramolecular hydrogen bonding. ISSN:1758-294

    Predictive Modeling of PROTAC Cell Permeability with Machine Learning

    Approaches for the prediction of PROTAC cell permeability are of major interest to reduce resource-demanding synthesis and testing of low-permeability PROTACs. We report a comprehensive investigation of the scope and limitations of machine learning-based binary classification models developed using simple 2D descriptors for large and structurally diverse sets of CRBN and VHL PROTACs. After construction and internal validation, the models were used for the prediction of blinded sets of PROTACs. For the VHL PROTAC set, k-nearest neighbor and random forest models succeeded in predicting the permeability with >80% accuracy (κ > 0.57). Models retrained by combining the original training and the blinded set performed equally well for a second blinded VHL set. However, models for CRBN PROTACs were less successful, mainly due to the highly imbalanced nature of the CRBN datasets. We conclude that properly trained machine learning models can be integrated as effective filters in the PROTAC design process.
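The abstract above quotes a statistic of 0.57 in parentheses next to the accuracy and attributes the weaker CRBN models to class imbalance. A minimal sketch, assuming that statistic is Cohen's kappa (the abstract does not say so explicitly). It shows why a chance-corrected measure is useful on imbalanced sets. The function name and the labels are hypothetical:

```python
from collections import Counter


def cohens_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement (accuracy); p_e is the agreement expected
    by chance from the marginal class frequencies.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    p_o = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when only one class occurs")
    return (p_o - p_e) / (1.0 - p_e)


# Balanced toy set: 75% accuracy, kappa 0.5.
print(cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0]))  # 0.5

# Imbalanced toy set: always predicting the majority class gives 90%
# accuracy, but the agreement is no better than chance.
imbalanced_true = [0] * 9 + [1]
majority_pred = [0] * 10
print(cohens_kappa(imbalanced_true, majority_pred))  # 0.0
```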